
feat(metrics): emit hoodie.table.version as gauge on client init #19108

Draft
shangxinli wants to merge 3 commits into apache:master from shangxinli:shangx/oss-metrics-table-version


shangxinli commented Jun 29, 2026

Contributor

Describe the issue this Pull Request addresses

The metrics surface does not expose the on-disk Hudi table version. Operators upgrading across major Hudi versions (particularly anyone running with hoodie.write.auto.upgrade=false and a pinned hoodie.write.table.version) currently have no metric to alert on when a table is unexpectedly promoted to a newer table version. This change fills that gap with a gauge that is emitted once on every write / table-service client init, so standard alerting (Prometheus / Datadog / Graphite / JMX) can flag the drift in near real time.

Summary and Changelog

  • Adds HoodieMetrics#emitTableVersionMetric(int) which registers a table.tableVersion gauge in the metrics registry.
  • BaseHoodieClient calls it once per client construction after loading the table's metaClient.
  • Best-effort: if the table has not yet been initialized (first-ever write to a new base path), the metaClient load throws; the exception is logged at DEBUG and the metric is skipped. The gauge will surface on the next client init after the table exists.
  • No-op when metrics are disabled (config.isMetricsOn() == false).
  • Adds unit tests in TestHoodieMetrics covering both the enabled and disabled paths.
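
The enabled and disabled paths listed above can be sketched roughly as follows. This is not the code from the PR: `SketchMetrics`, its map-backed registry, and the constructor arguments are hypothetical stand-ins for HoodieMetrics, the real metrics registry, and the write config. Only the method name emitTableVersionMetric(int) and the table.tableVersion suffix are taken from the description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for HoodieMetrics; the real class registers gauges
// in the configured metrics registry rather than in a plain map.
class SketchMetrics {
  private final boolean metricsOn;    // assumed to mirror config.isMetricsOn()
  private final String metricPrefix;  // assumed to already include the table name
  private final Map<String, Supplier<Integer>> registry = new ConcurrentHashMap<>();

  SketchMetrics(boolean metricsOn, String metricPrefix) {
    this.metricsOn = metricsOn;
    this.metricPrefix = metricPrefix;
  }

  // Registers a gauge that always reports the supplied version code.
  // Does nothing when metrics are disabled.
  void emitTableVersionMetric(int versionCode) {
    if (!metricsOn) {
      return;
    }
    registry.put(metricPrefix + ".table.tableVersion", () -> versionCode);
  }

  // Exposed only so a test can inspect what was registered.
  Map<String, Supplier<Integer>> gauges() {
    return registry;
  }
}
```

A test in the spirit of the two cases mentioned above would construct one instance with metrics on and one with metrics off, call emitTableVersionMetric on both, and check that only the first registry contains the gauge.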

Impact

Pure additive change. No on-disk state, no schema change, no behavior change for tables that disable metrics. Existing callers unaffected.

Gauge name: <metric.prefix>.<tableName>.table.tableVersion
Value: integer HoodieTableVersion#versionCode() (e.g. 6, 8)
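
As an illustration of the naming scheme and of how a consumer might use the value, here is a small sketch. The class `GaugeNames`, the helper names, and the drift check are assumptions made for this example; how the real code joins or sanitizes the prefix and table name is not shown in this description.

```java
// Hypothetical helpers; not part of the PR.
class GaugeNames {
  static final String TABLE_VERSION_SUFFIX = "table.tableVersion";

  // Builds <metric.prefix>.<tableName>.table.tableVersion by plain dot-joining.
  static String tableVersionGauge(String metricPrefix, String tableName) {
    return String.join(".", metricPrefix, tableName, TABLE_VERSION_SUFFIX);
  }

  // The kind of check an alert rule could encode: the observed version code
  // is higher than the version the operator pinned.
  static boolean promotedBeyondPinned(int observedVersionCode, int pinnedVersionCode) {
    return observedVersionCode > pinnedVersionCode;
  }
}
```

With a hypothetical prefix "hudi" and table name "trips", tableVersionGauge returns "hudi.trips.table.tableVersion". For a table pinned at version 6, an observed value of 8 would make promotedBeyondPinned return true.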

Risk Level

none — additive metric only.

Documentation Update

none — surfaced through the existing reporter infrastructure; no new config.

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

Adds HoodieMetrics#emitTableVersionMetric(int) which registers a
"table.tableVersion" gauge in the metrics registry. BaseHoodieClient
calls it once per client construction after loading the table's
metaClient.

Motivation
----------
The existing metrics surface does not expose the on-disk Hudi table
version at all. Operators upgrading across major Hudi versions
(particularly anyone running with hoodie.write.auto.upgrade=false
and a pinned hoodie.write.table.version) currently have no
observability signal to alert when a table is unexpectedly promoted
to a newer table version by an upgrade path. This change fills the
gap with a single one-shot gauge that fires on every write /
table-service client init, so standard alerting
(Prometheus / Datadog / Graphite / JMX) can flag the drift in near
real time.

Behavior
--------
- Gauge name: <metric.prefix>.<tableName>.table.tableVersion
- Value: integer HoodieTableVersion#versionCode() (e.g. 6, 8)
- Emitted at the end of BaseHoodieClient construction, once.
- Best-effort: if the table has not yet been initialized
  (first-ever write to a new base path), the metaClient load throws;
  the exception is logged at DEBUG and the metric is skipped. The
  gauge will surface on the next client init after the table exists.
- No-op when metrics are disabled (config.isMetricsOn() == false).
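
The best-effort behavior can be sketched as below, under stated assumptions: `BestEffortEmit` and `tryEmit` are hypothetical names, `versionLoader` stands in for loading the metaClient and reading the version code, `emitter` stands in for HoodieMetrics#emitTableVersionMetric, and java.util.logging's FINE level is used in place of the project's DEBUG logging.

```java
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; not part of the PR.
class BestEffortEmit {
  private static final Logger LOG = Logger.getLogger(BestEffortEmit.class.getName());

  // Returns true if the metric was emitted, false if it was skipped because
  // the version could not be loaded (e.g. the table does not exist yet).
  static boolean tryEmit(IntSupplier versionLoader, IntConsumer emitter) {
    int versionCode;
    try {
      versionCode = versionLoader.getAsInt();
    } catch (RuntimeException e) {
      // A failure to load must never fail client construction.
      LOG.log(Level.FINE, "Table not initialized yet; skipping table version metric", e);
      return false;
    }
    emitter.accept(versionCode);
    return true;
  }
}
```

Only the load is wrapped in the try block, so a failure while loading the table is swallowed while a bug in the emitter would still surface. Whether the actual change scopes its exception handling this way is not stated in the description.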

Test plan
---------
Added two unit tests to TestHoodieMetrics:
- testEmitTableVersionMetric: asserts the gauge is registered with
  the supplied version code.
- testEmitTableVersionMetricWhenMetricsDisabled: asserts no-op when
  metrics are off.

Compatibility
-------------
Pure additive change. No on-disk state, no schema change, no
behavior change for tables that disable metrics. Existing callers
unaffected.
github-actions bot added the size:S label (PR with lines of changes in (10, 100]) on Jun 29, 2026
shangxinli changed the title from "[HUDI-XXXXX] Emit hoodie.table.version as gauge metric on client init" to "feat(metrics): emit hoodie.table.version as gauge on client init" on Jun 29, 2026
@hudi-bot

Collaborator

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

